Multi-Class Image Classification Model for American Sign Language Alphabet Using TensorFlow Take 2¶

David Lowe¶

November 8, 2022¶

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The American Sign Language Alphabet Dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: The dataset contains over 220,000 images of hand gestures representing the American Sign Language alphabet, separated into 29 folders that represent the various classes. The research team collected these images to investigate the possibility of reducing the communication gap between sign-language users and non-sign-language users.

ANALYSIS: The EfficientNetV2M model's performance achieved an accuracy score of 99.01% after three epochs using the training dataset. When we applied the model to the validation dataset, the model achieved an accuracy score of 88.47%.

CONCLUSION: In this iteration, the TensorFlow EfficientNetV2M CNN model appeared suitable for modeling this dataset.

Dataset ML Model: Multi-class classification with image features

Dataset Used: ASL (American Sign Language) Alphabet Dataset

Dataset Reference: https://www.kaggle.com/datasets/debashishsau/aslamerican-sign-language-aplhabet-dataset

One source of potential performance benchmarks: https://www.kaggle.com/datasets/debashishsau/aslamerican-sign-language-aplhabet-dataset/code

Task 1 - Prepare Environment¶

In [1]:
# Retrieve CPU information from the system
ncpu = !nproc
print("The number of available CPUs is:", ncpu[0])
The number of available CPUs is: 2
In [2]:
# Retrieve memory configuration information
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))
Your runtime has 13.6 gigabytes of available RAM

In [3]:
# Retrieve GPU configuration information
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
print(gpu_info)
Fri Nov  4 23:44:58 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   46C    P8    10W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

1.a) Load libraries and modules¶

In [4]:
# Set the random seed number for reproducible results
RNG_SEED = 888
In [5]:
import random
random.seed(RNG_SEED)
import numpy as np
np.random.seed(RNG_SEED)
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os
import sys
import math
# import boto3
import zipfile
from datetime import datetime
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score

import tensorflow as tf
tf.random.set_seed(RNG_SEED)
from tensorflow import keras
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.preprocessing.image import ImageDataGenerator

1.b) Set up the controlling parameters and functions¶

In [6]:
# Begin the timer for the script processing
START_TIME_SCRIPT = datetime.now()
In [7]:
# Set up the number of CPU cores available for multi-thread processing
N_JOBS = 1

# Set the flag that controls progress-email notifications (True sends status emails)
NOTIFY_STATUS = False

# Set the percentage sizes for splitting the dataset
TEST_SET_RATIO = 0.2
VAL_SET_RATIO = 0.2

# Set the number of folds for cross validation
N_FOLDS = 5
N_ITERATIONS = 1

# Set various default modeling parameters
DEFAULT_LOSS = 'categorical_crossentropy'
DEFAULT_METRICS = ['accuracy']
DEFAULT_OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=0.0001)
CLASSIFIER_ACTIVATION = 'softmax'
MAX_EPOCHS = 3
BATCH_SIZE = 16
# CLASS_LABELS = []
# CLASS_NAMES = []
# RAW_IMAGE_SIZE = (250, 250)
TARGET_IMAGE_SIZE = (299, 299)
INPUT_IMAGE_SHAPE = (TARGET_IMAGE_SIZE[0], TARGET_IMAGE_SIZE[1], 3)

# Define the labels to use for graphing the data
TRAIN_METRIC = "accuracy"
VALIDATION_METRIC = "val_accuracy"
TRAIN_LOSS = "loss"
VALIDATION_LOSS = "val_loss"

# Define the directory locations and file names
STAGING_DIR = 'staging/'
TRAIN_DIR = 'staging/ASL_Alphabet_Dataset/asl_alphabet_train'
# VALID_DIR = ''
TEST_DIR = 'staging/ASL_Alphabet_Dataset/asl_alphabet_test'
TRAIN_DATASET = 'archive.zip'
# VALID_DATASET = ''
# TEST_DATASET = ''
# TRAIN_LABELS = ''
# VALID_LABELS = ''
# TEST_LABELS = ''
# OUTPUT_DIR = 'staging/'
# SAMPLE_SUBMISSION_CSV = 'sample_submission.csv'
# FINAL_SUBMISSION_CSV = 'submission.csv'

# Check the number of GPUs accessible through TensorFlow
print('Num GPUs Available:', len(tf.config.list_physical_devices('GPU')))

# Print out the TensorFlow version for confirmation
print('TensorFlow version:', tf.__version__)
Num GPUs Available: 1
TensorFlow version: 2.9.2
In [8]:
# Set up the email notification function
def status_notify(msg_text):
    import boto3  # imported here so the notebook runs even when notifications are disabled
    access_key = os.environ.get('SNS_ACCESS_KEY')
    secret_key = os.environ.get('SNS_SECRET_KEY')
    aws_region = os.environ.get('SNS_AWS_REGION')
    topic_arn = os.environ.get('SNS_TOPIC_ARN')
    if (access_key is None) or (secret_key is None) or (aws_region is None) or (topic_arn is None):
        sys.exit("Incomplete notification setup info. Script Processing Aborted!!!")
    sns = boto3.client('sns', aws_access_key_id=access_key, aws_secret_access_key=secret_key, region_name=aws_region)
    response = sns.publish(TopicArn=topic_arn, Message=msg_text)
    if response['ResponseMetadata']['HTTPStatusCode'] != 200:
        print('Status notification not OK with HTTP status code:', response['ResponseMetadata']['HTTPStatusCode'])
In [9]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 1 - Prepare Environment completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 2 - Load and Prepare Images¶

In [10]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 2 - Load and Prepare Images has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [11]:
# Clean up the old files and download directories before receiving new ones
!rm -rf staging/

if not os.path.exists(TRAIN_DATASET):
    !wget https://dainesanalytics.com/datasets/kaggle-debashishsau-sign-language-aplhabet/archive.zip

with zipfile.ZipFile(TRAIN_DATASET, 'r') as zip_ref:
    zip_ref.extractall(STAGING_DIR)
--2022-11-04 23:45:06--  https://dainesanalytics.com/datasets/kaggle-debashishsau-sign-language-aplhabet/archive.zip
Resolving dainesanalytics.com (dainesanalytics.com)... 13.226.52.82, 13.226.52.21, 13.226.52.22, ...
Connecting to dainesanalytics.com (dainesanalytics.com)|13.226.52.82|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4508201844 (4.2G) [application/zip]
Saving to: ‘archive.zip’

archive.zip         100%[===================>]   4.20G  49.7MB/s    in 95s     

2022-11-04 23:46:41 (45.5 MB/s) - ‘archive.zip’ saved [4508201844/4508201844]

In [12]:
CLASS_LABELS = os.listdir(TRAIN_DIR)
print(CLASS_LABELS)
NUM_CLASSES = len(CLASS_LABELS)
print('Total number of classes detected:', NUM_CLASSES)
['H', 'T', 'S', 'R', 'V', 'N', 'F', 'space', 'W', 'D', 'Q', 'X', 'J', 'M', 'I', 'B', 'E', 'nothing', 'Y', 'C', 'U', 'A', 'G', 'P', 'O', 'Z', 'del', 'L', 'K']
Total number of classes detected: 29
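One detail worth keeping in mind: `os.listdir()` returns the class folders in arbitrary filesystem order (as seen above), while Keras' `flow_from_directory` builds its `class_indices` mapping from the alphanumerically sorted folder names. A minimal sketch, using a hand-picked subset of the folder names above, of the ordering the generators will actually use (Python's default string sort places the uppercase letters before 'del', 'nothing', and 'space'):

```python
# A subset of the class folders listed above, in the arbitrary order
# os.listdir() happened to return them.
listed_labels = ['H', 'T', 'space', 'nothing', 'del', 'A']

# flow_from_directory sorts folder names alphanumerically to assign class
# indices, so this sorted list reflects the generator's label ordering.
canonical_order = sorted(listed_labels)
print(canonical_order)  # → ['A', 'H', 'T', 'del', 'nothing', 'space']
```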
In [13]:
# Brief listing of training image files for each class
for c_label in CLASS_LABELS:
    training_class_dir = os.path.join(TRAIN_DIR, c_label)
    training_class_files = os.listdir(training_class_dir)
    print('Number of training images for', c_label, ':', len(training_class_files))
    print('Training samples for', c_label, ':', training_class_files[:5], '\n')
Number of training images for H : 7906
Training samples for H : ['H2899.jpg', 'H (3702).jpg', 'H (399).jpg', 'H (2657).jpg', 'H26.jpg'] 

Number of training images for T : 8054
Training samples for T : ['T21.jpg', 'T2027.jpg', 't_40_rotate_9.jpeg', 'T (2531).jpg', 't_34_rotate_7.jpeg'] 

Number of training images for S : 8109
Training samples for S : ['S (1687).jpg', 'S (3113).jpg', 's_46_rotate_8.jpeg', 's_14_rotate_5.jpeg', 'S2780.jpg'] 

Number of training images for R : 8021
Training samples for R : ['R2942.jpg', 'R2948.jpg', 'R (1118).jpg', 'r_7_rotate_8.jpeg', 'R (2692).jpg'] 

Number of training images for V : 7597
Training samples for V : ['V (928).jpg', 'V (3496).jpg', 'V (3505).jpg', 'V (2662).jpg', 'V (2228).jpg'] 

Number of training images for N : 7932
Training samples for N : ['N1300.jpg', 'N (3434).jpg', 'n_55_rotate_5.jpeg', 'N (3453).jpg', 'N1317.jpg'] 

Number of training images for F : 8031
Training samples for F : ['F (912).jpg', 'F2219.jpg', 'F (3179).jpg', 'F (1555).jpg', 'F (3089).jpg'] 

Number of training images for space : 7071
Training samples for space : ['space2050.jpg', 'space2560.jpg', 'space (2029).jpg', 'space831.jpg', 'space (2691).jpg'] 

Number of training images for W : 7787
Training samples for W : ['W352.jpg', 'w_35_rotate_9.jpeg', 'w_53_rotate_9.jpeg', 'W (1762).jpg', 'W (3090).jpg'] 

Number of training images for D : 7629
Training samples for D : ['D (1864).jpg', 'D (2058).jpg', 'D2653.jpg', 'D (202).jpg', 'D2665.jpg'] 

Number of training images for Q : 7954
Training samples for Q : ['Q1212.jpg', 'Q383.jpg', 'Q (619).jpg', 'Q (2662).jpg', 'q_4_rotate_2.jpeg'] 

Number of training images for X : 8093
Training samples for X : ['x_52_rotate_2.jpeg', 'X (349).jpg', 'X (1045).jpg', 'X2470.jpg', '91.jpg'] 

Number of training images for J : 7503
Training samples for J : ['J1472.jpg', 'J (2817).jpg', 'j_53_rotate_8.jpeg', 'J (638).jpg', 'J (3000).jpg'] 

Number of training images for M : 7900
Training samples for M : ['M166.jpg', 'M361.jpg', 'M (1761).jpg', 'm_50_rotate_8.jpeg', 'M1929.jpg'] 

Number of training images for I : 7953
Training samples for I : ['I1290.jpg', 'I (1281).jpg', 'I (3090).jpg', 'I2815.jpg', '91.jpg'] 

Number of training images for B : 8309
Training samples for B : ['B (3114).jpg', 'B (540).jpg', 'B (3019).jpg', '91.jpg', 'B (1659).jpg'] 

Number of training images for E : 7744
Training samples for E : ['e_18_rotate_5.jpeg', 'E (2086).jpg', 'E (373).jpg', 'E (1266).jpg', 'E (1267).jpg'] 

Number of training images for nothing : 3030
Training samples for nothing : ['nothing1317.jpg', 'nothing1916.jpg', 'nothing1454.jpg', 'nothing350.jpg', 'nothing2532.jpg'] 

Number of training images for Y : 8178
Training samples for Y : ['Y (1133).jpg', 'Y (3133).jpg', 'Y337.jpg', 'Y1241.jpg', 'Y (1636).jpg'] 

Number of training images for C : 8146
Training samples for C : ['C (510).jpg', 'C (439).jpg', 'C (3820).jpg', '91.jpg', 'C1602.jpg'] 

Number of training images for U : 8023
Training samples for U : ['U (1760).jpg', 'U (3751).jpg', 'U (2389).jpg', 'U1109.jpg', 'U650.jpg'] 

Number of training images for A : 8458
Training samples for A : ['A (1725).jpg', 'A2510.jpg', 'A (3046).jpg', 'A (2274).jpg', 'A (2428).jpg'] 

Number of training images for G : 7844
Training samples for G : ['G141.jpg', 'g_52_rotate_5.jpeg', 'G1316.jpg', 'G1799.jpg', 'G (592).jpg'] 

Number of training images for P : 7601
Training samples for P : ['p_53_rotate_2.jpeg', 'P1709.jpg', 'P2246.jpg', 'P (3294).jpg', 'P (1054).jpg'] 

Number of training images for O : 8140
Training samples for O : ['O2204.jpg', 'O1913.jpg', 'O1411.jpg', 'O (2872).jpg', 'O (3477).jpg'] 

Number of training images for Z : 7410
Training samples for Z : ['Z (2870).jpg', 'Z (2161).jpg', 'Z2911.jpg', 'Z224.jpg', 'Z (1923).jpg'] 

Number of training images for del : 6836
Training samples for del : ['del (3724).jpg', 'del1239.jpg', 'del1380.jpg', 'del (1332).jpg', 'del (2381).jpg'] 

Number of training images for L : 7939
Training samples for L : ['L (2460).jpg', 'l_52_rotate_4.jpeg', '91.jpg', 'L (2447).jpg', 'L (337).jpg'] 

Number of training images for K : 7876
Training samples for K : ['K1978.jpg', 'K220.jpg', 'K (2452).jpg', 'K (925).jpg', '91.jpg'] 
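The listing also reveals a class imbalance worth noting: most letter classes have roughly 7,000 to 8,500 images, while 'nothing' has only 3,030. A small sketch, using a hand-picked subset of the counts printed above, for spotting the most under-represented class:

```python
# A subset of the per-class image counts printed above.
counts = {'A': 8458, 'B': 8309, 'nothing': 3030, 'del': 6836, 'space': 7071}

# The class with the fewest training images.
smallest = min(counts, key=counts.get)
print(smallest, counts[smallest])  # → nothing 3030
```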

In [14]:
# Plot some training images from the dataset
nrows = len(CLASS_LABELS)
ncols = 4
training_examples = []
example_labels = []

fig = plt.gcf()
fig.set_size_inches(ncols * 4, nrows * 3)

for c_label in CLASS_LABELS:
    training_class_dir = os.path.join(TRAIN_DIR, c_label)
    training_class_files = os.listdir(training_class_dir)
    for j in range(ncols):
        training_examples.append(training_class_dir + '/' + training_class_files[j])
        example_labels.append(c_label)
    # print(training_examples)
    # print(example_labels)

for i, img_path in enumerate(training_examples):
    # Set up subplot; subplot indices start at 1
    sp = plt.subplot(nrows, ncols, i+1)
    sp.text(0, 0, example_labels[i])
    # sp.axis('Off')
    img = mpimg.imread(img_path)
    plt.imshow(img)
plt.show()
In [15]:
datagen_kwargs = dict(rescale=1./255, validation_split=VAL_SET_RATIO)
training_datagen = ImageDataGenerator(**datagen_kwargs)
validation_datagen = ImageDataGenerator(**datagen_kwargs)
dataflow_kwargs = dict(class_mode="categorical")

do_data_augmentation = True
if do_data_augmentation:
    training_datagen = ImageDataGenerator(rotation_range=45,
                                          horizontal_flip=True,
                                          vertical_flip=True,
                                          **datagen_kwargs)

print('Loading and pre-processing the training images...')
training_generator = training_datagen.flow_from_directory(directory=TRAIN_DIR,
                                                          target_size=TARGET_IMAGE_SIZE,
                                                          batch_size=BATCH_SIZE,
                                                          shuffle=True,
                                                          seed=RNG_SEED,
                                                          subset="training",
                                                          **dataflow_kwargs)
print('Number of training image batches per epoch of modeling:', len(training_generator))

print('Loading and pre-processing the validation images...')
validation_generator = validation_datagen.flow_from_directory(directory=TRAIN_DIR,
                                                              target_size=TARGET_IMAGE_SIZE,
                                                              batch_size=BATCH_SIZE,
                                                              shuffle=False,
                                                              subset="validation",
                                                              **dataflow_kwargs)
print('Number of validation image batches per epoch of modeling:', len(validation_generator))
Loading and pre-processing the training images...
Found 178472 images belonging to 29 classes.
Number of training image batches per epoch of modeling: 11155
Loading and pre-processing the validation images...
Found 44602 images belonging to 29 classes.
Number of validation image batches per epoch of modeling: 2788
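The batch counts reported above follow directly from the image counts and `BATCH_SIZE`: each generator yields `ceil(n_images / batch_size)` batches per epoch. A quick sanity check against the figures printed by `flow_from_directory`:

```python
import math

BATCH_SIZE = 16
n_train, n_val = 178472, 44602  # counts reported by flow_from_directory above

# Each epoch iterates over every image once, so the last batch may be partial.
train_batches = math.ceil(n_train / BATCH_SIZE)
val_batches = math.ceil(n_val / BATCH_SIZE)
print(train_batches, val_batches)  # → 11155 2788
```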
In [16]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 2 - Load and Prepare Images completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 3 - Define and Train Models¶

In [17]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 3 - Define and Train Models has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [18]:
# Define the function for plotting training results for comparison
def plot_metrics(history):
    plt.figure(figsize=(24, 10))
    metrics = [TRAIN_LOSS, TRAIN_METRIC]
    for n, metric in enumerate(metrics):
        name = metric.replace("_", " ").capitalize()
        plt.subplot(1, 2, n+1)
        plt.plot(history.epoch, history.history[metric], color='blue', label='Train')
        plt.plot(history.epoch, history.history['val_'+metric], color='red', linestyle="--", label='Val')
        plt.xlabel('Epoch')
        plt.ylabel(name)
        if metric == TRAIN_LOSS:
            plt.ylim([0, plt.ylim()[1]])
        else:
            plt.ylim([0, 1])
        plt.legend()
In [19]:
# Define the baseline model for benchmarking
def create_nn_model(input_param=INPUT_IMAGE_SHAPE, output_param=NUM_CLASSES, dense_nodes=2048,
                    classifier_activation=CLASSIFIER_ACTIVATION, loss_param=DEFAULT_LOSS,
                    opt_param=DEFAULT_OPTIMIZER, metrics_param=DEFAULT_METRICS):
    base_model = keras.applications.efficientnet_v2.EfficientNetV2M(include_top=False, weights='imagenet', input_shape=input_param)
    nn_model = keras.models.Sequential()
    nn_model.add(base_model)
    nn_model.add(keras.layers.Flatten())
    nn_model.add(keras.layers.Dense(dense_nodes, activation='relu'))
    nn_model.add(keras.layers.Dense(output_param, activation=classifier_activation))
    nn_model.compile(loss=loss_param, optimizer=opt_param, metrics=metrics_param)
    return nn_model
In [20]:
# Initialize the neural network model and get the training results for plotting graph
start_time_module = datetime.now()
tf.keras.utils.set_random_seed(RNG_SEED)
baseline_model = create_nn_model()
baseline_model_history = baseline_model.fit(training_generator,
                                            epochs=MAX_EPOCHS,
                                            validation_data=validation_generator,
                                            verbose=1)
print('Total time for model fitting:', (datetime.now() - start_time_module))
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/efficientnet_v2/efficientnetv2-m_notop.h5
214201816/214201816 [==============================] - 1s 0us/step
Epoch 1/3
11155/11155 [==============================] - 9958s 889ms/step - loss: 0.2409 - accuracy: 0.9306 - val_loss: 0.5858 - val_accuracy: 0.8881
Epoch 2/3
11155/11155 [==============================] - 9896s 887ms/step - loss: 0.0568 - accuracy: 0.9846 - val_loss: 1.1717 - val_accuracy: 0.7889
Epoch 3/3
11155/11155 [==============================] - 9903s 888ms/step - loss: 0.0371 - accuracy: 0.9901 - val_loss: 0.7385 - val_accuracy: 0.8847
Total time for model fitting: 8:17:01.646359
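Note that validation accuracy actually peaked at epoch 1 (88.81%), dipped sharply at epoch 2, and only partially recovered to 88.47% by the final epoch. A minimal sketch of recovering the best epoch from the logged values; in a future iteration, a `tf.keras.callbacks.ModelCheckpoint` callback with `save_best_only=True` could capture that best-performing checkpoint automatically:

```python
# val_accuracy values from the three training epochs logged above.
val_accuracy = [0.8881, 0.7889, 0.8847]

# Epochs are 1-indexed in the training log.
best_epoch = max(range(len(val_accuracy)), key=lambda i: val_accuracy[i]) + 1
print(best_epoch, val_accuracy[best_epoch - 1])  # → 1 0.8881
```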
In [21]:
baseline_model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 efficientnetv2-m (Functiona  (None, 10, 10, 1280)     53150388  
 l)                                                              
                                                                 
 flatten (Flatten)           (None, 128000)            0         
                                                                 
 dense (Dense)               (None, 2048)              262146048 
                                                                 
 dense_1 (Dense)             (None, 29)                59421     
                                                                 
=================================================================
Total params: 315,355,857
Trainable params: 315,063,825
Non-trainable params: 292,032
_________________________________________________________________
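The dense-layer parameter counts in the summary can be verified by hand: the EfficientNetV2M backbone emits a 10×10×1280 feature map, which flattens to 128,000 units feeding the 2,048-unit dense layer and, in turn, the 29-class softmax head:

```python
# Sanity-check the dense-layer parameter counts from the summary above.
flatten_units = 10 * 10 * 1280               # backbone feature map, flattened

dense_params = flatten_units * 2048 + 2048   # weights + biases, 2048-unit layer
dense_1_params = 2048 * 29 + 29              # weights + biases, 29-class head
print(flatten_units, dense_params, dense_1_params)  # → 128000 262146048 59421
```

The flattened feature map is why this head alone contributes roughly 262M of the model's 315M parameters; a `GlobalAveragePooling2D` layer in place of `Flatten` is a common way to shrink such a head, though this iteration keeps `Flatten`.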
In [22]:
plot_metrics(baseline_model_history)
In [23]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 3 - Define and Train Models completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 4 - Tune and Optimize Models¶

In [24]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 4 - Tune and Optimize Models has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [25]:
# Not applicable for this iteration of modeling
In [26]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 4 - Tune and Optimize Models completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 5 - Finalize Model and Make Predictions¶

In [27]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 5 - Finalize Model and Make Predictions has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [28]:
# Not applicable for this iteration of modeling
In [29]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 5 - Finalize Model and Make Predictions completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [30]:
print('Total time for the script:', (datetime.now() - START_TIME_SCRIPT))
Total time for the script: 8:20:07.844593